Improved Nearest Neighbor Based Approach to Accurate Document Skew Estimation

Authors

  • Yue Lu
  • Chew Lim Tan
Abstract

Nearest-neighbor based document skew detection methods do not require a predominant text area and are not subject to any skew angle limitation. However, their accuracy is generally limited. In this paper, we present an improved nearest-neighbor based approach to accurate document skew estimation. A size restriction is introduced into the detection of nearest-neighbor pairs. The chains with the largest possible number of nearest-neighbor pairs are then selected, and their slopes are computed to give the skew angle of the document image. Experimental results on various types of documents containing different linguistic scripts and diverse layouts show that the proposed approach achieves improved accuracy in estimating the document image skew angle and has the advantage of being language independent.
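The pipeline outlined in the abstract (size-restricted nearest-neighbor pairs, then chains, then slopes of the longest chains) can be illustrated with a minimal sketch. This is not the authors' implementation. The input format (connected components given as centroid plus height), the height-ratio form of the size restriction, the `gap_factor` distance cap, the restriction to right-hand neighbors, and the use of the median over the longest chains are all assumptions made here for illustration. The function names are hypothetical.

```python
import math
from statistics import median


def nearest_neighbor_pairs(components, size_ratio=1.5, gap_factor=3.0):
    """Pair each component with its nearest right-hand neighbour of similar size.

    components: list of (cx, cy, height) tuples (assumed input format).
    size_ratio and gap_factor are illustrative thresholds, not values from the paper.
    """
    pairs = []
    for i, (xi, yi, hi) in enumerate(components):
        best, best_d = None, math.inf
        for j, (xj, yj, hj) in enumerate(components):
            if i == j or xj <= xi:
                continue
            # Size restriction (assumed form): heights must be comparable.
            if max(hi, hj) > size_ratio * min(hi, hj):
                continue
            d = math.hypot(xj - xi, yj - yi)
            # Distance cap (assumption): skip neighbours that are too far away.
            if d > gap_factor * max(hi, hj):
                continue
            if d < best_d:
                best, best_d = j, d
        if best is not None:
            pairs.append((i, best))
    return pairs


def build_chains(pairs):
    """Link adjacent pairs into chains by following successor links."""
    nxt = dict(pairs)
    successors = set(nxt.values())
    chains = []
    for start in nxt:
        if start in successors:
            continue  # not the head of a chain
        chain = [start]
        while chain[-1] in nxt:
            chain.append(nxt[chain[-1]])
        chains.append(chain)
    return chains


def estimate_skew(components, size_ratio=1.5, gap_factor=3.0):
    """Return a skew estimate in degrees from the longest chains."""
    chains = build_chains(nearest_neighbor_pairs(components, size_ratio, gap_factor))
    if not chains:
        raise ValueError("no nearest-neighbour chains found")
    longest = max(len(c) for c in chains)
    angles = []
    for c in chains:
        if len(c) == longest:
            x0, y0, _ = components[c[0]]
            x1, y1, _ = components[c[-1]]
            # Slope of the chain, taken from its first to its last centroid.
            angles.append(math.degrees(math.atan2(y1 - y0, x1 - x0)))
    return median(angles)
```

Calling `estimate_skew` on a list of `(cx, cy, height)` tuples returns an angle in degrees, measured in whatever coordinate convention the centroids use. Restricting the search to right-hand neighbors keeps chains free of cycles but limits this sketch to skews within ±90°, which is narrower than the unrestricted range the abstract claims for the method.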

Similar resources

A nearest-neighbor chain based approach to skew estimation in document images

A nearest-neighbor chain (NNC) based approach is proposed in this paper to develop a skew estimation method with high accuracy and language-independent capability. Size restriction is introduced to the detection of nearest-neighbors (NN). Then NNCs are extracted from the adjacent NN pairs, in which the slopes of the NNCs with the largest possible number of components are computed to give t...

Skew Estimation by Parts

This paper proposes a new part-based approach for skew estimation of document images. The proposed method first estimates skew angles on rather small areas, which are the local parts of characters, and subsequently determines the global skew angle by aggregating those local estimations. A local skew estimation on a part of a skewed character is performed by finding an identical part from prepar...

Software Cost Estimation by a New Hybrid Model of Particle Swarm Optimization and K-Nearest Neighbor Algorithms

A successful software project should be completed within a predetermined cost and time. Software is a product whose cost consists mainly of its expert workforce and professionals. The most important part of software cost estimation (SCE) is related to the trained workforce. The creative and abstract nature of software projects makes the cost and time of projects extremely difficult ...

Estimation of Density using Plotless Density Estimator Criteria in Arasbaran Forest

Sampling methods have a theoretical basis and should be operational in different forests; therefore selecting an appropriate sampling method is effective for accurate estimation of forest characteristics. The purpose of this study was to estimate the stand density (number per hectare) in Arasbaran forest using a variety of the plotless density estimators of the nearest neighbors sampling me...

An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification

The Internet provides easy access to a variety of library resources. However, classifying documents from a large amount of data is still an issue, and finding particular documents demands time and energy. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...

Publication date: 2003